Distributed Intersection Join of Complex Interval Sequences

Authors

  • Hans-Peter Kriegel
  • Peter Kunath
  • Martin Pfeifle
  • Matthias Renz
Abstract

In many application areas, e.g. space observation systems or the engineering systems of companies operating worldwide, an efficient distributed intersection join is needed to extract new, global knowledge. One solution for carrying out a global intersection join is to transmit all distributed information from the clients to a central server, which leads to high transfer cost. In this paper, we present a new distributed intersection join for high-cardinality interval sequences that tries to minimize this transmission cost. Our approach is based on a suitable probability model for interval intersections, which is used on the server as well as on the various clients. On the client sites, we group intervals together based on this probability model. These locally created approximations are sent to the server. The server ranks all intersecting approximations according to our probability model. As not all approximations have to be refined in order to decide whether two objects intersect, we fetch the exact information of the most promising approximations first. This strategy cuts down the transmission cost considerably, as shown by our experimental evaluation on synthetic and real-world test data sets.
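The filter-and-refine flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the probability model or the grouping rule, so both are stand-ins here: grouping uses a simple gap threshold (`max_gap`), and the score assumes each point in the overlap of two approximations is covered by both objects independently with probability equal to the product of their densities. All names (`Approximation`, `group_intervals`, `score`, `distributed_join`) are hypothetical, and the intervals within one sequence are assumed to be disjoint.

```python
from dataclasses import dataclass
from typing import List, Tuple

Interval = Tuple[int, int]  # closed integer interval [lo, hi]


@dataclass
class Approximation:
    lo: int
    hi: int
    members: List[Interval]  # exact intervals; these stay on the client

    @property
    def density(self) -> float:
        # Fraction of the covering interval occupied by exact intervals
        # (assumes the member intervals are disjoint).
        covered = sum(hi - lo + 1 for lo, hi in self.members)
        return min(1.0, covered / (self.hi - self.lo + 1))


def group_intervals(seq: List[Interval], max_gap: int) -> List[Approximation]:
    """Client side: merge neighbours whose gap is at most max_gap (assumed rule)."""
    groups: List[Approximation] = []
    for lo, hi in sorted(seq):
        if groups and lo - groups[-1].hi - 1 <= max_gap:
            groups[-1].hi = max(groups[-1].hi, hi)
            groups[-1].members.append((lo, hi))
        else:
            groups.append(Approximation(lo, hi, [(lo, hi)]))
    return groups


def score(a: Approximation, b: Approximation) -> float:
    """Assumed stand-in for the intersection probability of two approximations."""
    overlap = min(a.hi, b.hi) - max(a.lo, b.lo) + 1
    if overlap <= 0:
        return 0.0
    return 1.0 - (1.0 - a.density * b.density) ** overlap


def exact_intersect(a: Approximation, b: Approximation) -> bool:
    """Refinement step: compare the exact intervals behind two approximations."""
    return any(x_lo <= y_hi and y_lo <= x_hi
               for x_lo, x_hi in a.members
               for y_lo, y_hi in b.members)


def distributed_join(approx_a: List[Approximation],
                     approx_b: List[Approximation]) -> Tuple[bool, int]:
    """Server side: rank overlapping pairs, refine the most promising first.

    Returns (objects_intersect, number_of_refined_pairs).
    """
    candidates = [(score(a, b), i, j)
                  for i, a in enumerate(approx_a)
                  for j, b in enumerate(approx_b)]
    candidates = [c for c in candidates if c[0] > 0.0]
    candidates.sort(reverse=True)  # highest estimated probability first
    refined = 0
    for _, i, j in candidates:
        refined += 1  # stands for one fetch of exact data from the clients
        if exact_intersect(approx_a[i], approx_b[j]):
            return True, refined  # one hit decides the join predicate
    return False, refined
```

In this sketch the number of refined pairs serves as a rough proxy for transmission cost: the join stops at the first confirmed intersection, so a good ranking means fewer exact intervals have to be fetched.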


Similar resources

Sequenced Subset Operators: Definition and Implementation

Difference, intersection, semi-join and anti-semi-join may be considered binary subset operators, in that they all return a subset of their left-hand argument. These operators are useful for implementing SQL’s EXCEPT, INTERSECT, NOT IN and NOT EXISTS, distributed queries and referential integrity. Difference-all and intersection-all operate on multi-sets and track the number of duplicates in bo...


Size Estimation of the Intersection Join between Two Line Segment Datasets

In this paper we provide a theoretical framework for estimating the size of the intersection join between two line segment datasets (e.g., roads, railways, utilities). For real datasets, it has been pointed out that the line segment lengths and slopes are distributed according to specific mathematical laws [14]. Starting from this result, we show how to predict the size of the intersection join...


Distributed Statistical Estimation of Matrix Products with Applications

We consider statistical estimations of a matrix product over the integers in a distributed setting, where we have two parties Alice and Bob; Alice holds a matrix A and Bob holds a matrix B, and they want to estimate statistics of A · B. We focus on the well-studied ℓp-norm, distinct elements (p = 0), ℓ0-sampling, and heavy hitter problems. The goal is to minimize both the communication cost and...


Efficient Join Processing for Complex Rasterized Objects

One of the most common query types in spatial database management systems is the spatial intersection join. Many state-of-the-art join algorithms use minimal bounding rectangles to determine join candidates in a first filter step. In the case of very complex spatial objects, as used in novel database applications including computer-aided design and geographical information systems, these one-va...


The Join Levels of the Trotter-Weil Hierarchy Are Decidable

The variety DA of finite monoids has a huge number of different characterizations, ranging from two-variable first-order logic FO to unambiguous polynomials. In order to study the structure of the subvarieties of DA, Trotter and Weil considered the intersection of varieties of finite monoids with bands, i.e., with idempotent monoids. The varieties of idempotent monoids are very well understood ...



Publication date: 2005